Robust Phoneme Recognition Using High Resolution Temporal Envelopes

Authors

  • Sriram Ganapathy
  • Hynek Hermansky
Abstract

Frequency domain linear prediction (FDLP) is a technique for auto-regressive (AR) modeling of the Hilbert envelope of a signal. The model is derived by applying linear prediction to the discrete cosine transform (DCT) of the signal. In this paper, we propose modifications of the basic FDLP approach for deriving high resolution envelopes. We identify factors that affect temporal resolution in FDLP, such as the location of the input peaks within the analysis segment, the type of window applied in the DCT of the signal, and the order of the FDLP model. This analysis enables us to improve the resolution of temporal envelopes derived from FDLP. Features extracted from the high resolution envelopes outperform MFCC features in noisy phoneme recognition experiments (relative improvement of 10%) and in phoneme recognition on conversational telephone speech (relative improvement of 5%).
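The basic FDLP step described in the abstract (linear prediction applied to the DCT of the signal, yielding an AR model of the temporal envelope) can be illustrated with a minimal sketch. This is not the authors' implementation and does not include the proposed high resolution modifications. The function name `fdlp_envelope`, the autocorrelation (Yule-Walker) solution, the small ridge term, the model order of 20, and the synthetic test signal are all assumptions chosen for illustration.

```python
import numpy as np
from scipy.fft import dct
from scipy.linalg import solve_toeplitz


def fdlp_envelope(x, order=20):
    """Sketch of a basic FDLP envelope estimate (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Step 1: DCT of the whole analysis segment.
    c = dct(x, type=2, norm="ortho")
    # Step 2: autocorrelation of the DCT sequence for lags 0..order.
    r = np.array([np.dot(c[: n - k], c[k:]) for k in range(order + 1)])
    r[0] *= 1.0 + 1e-6  # small ridge (an assumption) for numerical stability
    # Step 3: solve the Yule-Walker equations for the predictor coefficients.
    a = solve_toeplitz(r[:order], r[1 : order + 1])
    gain = r[0] - np.dot(a, r[1 : order + 1])
    # Step 4: evaluate gain / |A(e^{jw})|^2 on (0, pi); because the model was
    # fit in the DCT domain, the angle w plays the role of time.
    w = np.pi * (np.arange(n) + 0.5) / n
    poly = np.concatenate(([1.0], -a))
    response = np.exp(-1j * np.outer(w, np.arange(order + 1))) @ poly
    return gain / np.abs(response) ** 2


# Synthetic example: one impulse in low-level noise.
rng = np.random.default_rng(0)
signal = 0.01 * rng.standard_normal(1000)
signal[300] += 1.0
envelope = fdlp_envelope(signal, order=20)
print("envelope peak near sample", int(np.argmax(envelope)))
```

In this sketch, raising `order` lets the AR model follow sharper or more numerous peaks, which is one of the resolution factors the abstract lists. The window type and peak location effects are not modeled here.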

Similar resources

Temporal resolution analysis in frequency domain linear prediction.

Frequency domain linear prediction (FDLP) is a technique for auto-regressive modeling of Hilbert envelopes. In this letter, the resolution properties of the FDLP model are investigated using synthetic signals with impulses immersed in noise. The effects of various factors that affect the temporal resolution are studied, and this analysis suggests ways to improve the resolution of the FDLP envelo...

Hilbert envelope based spectro-temporal features for phoneme recognition in telephone speech

In this paper, we present a spectro-temporal feature extraction technique using sub-band Hilbert envelopes of relatively long segments of speech signal. Hilbert envelopes of the sub-bands are estimated using Frequency Domain Linear Prediction (FDLP). Spectral features are derived by integrating the sub-band Hilbert envelopes in short-term frames and the temporal features are formed by convertin...

Temporal envelope compensation for robust phoneme recognition using modulation spectrum.

A robust feature extraction technique for phoneme recognition is proposed which is based on deriving modulation frequency components from the speech signal. The modulation frequency components are computed from syllable-length segments of sub-band temporal envelopes estimated using frequency domain linear prediction. Although the baseline features provide good performance in clean conditions, t...

Static and dynamic modulation spectrum for speech recognition

We present a feature extraction technique based on static and dynamic modulation spectrum derived from long-term envelopes in sub-bands. Estimation of the sub-band temporal envelopes is done using Frequency Domain Linear Prediction (FDLP). These sub-band envelopes are compressed with a static (logarithmic) and dynamic (adaptive loops) compression. The compressed sub-band envelopes are transform...

Modulation frequency features for phoneme recognition in noisy speech.

In this letter, a new feature extraction technique based on modulation spectrum derived from syllable-length segments of subband temporal envelopes is proposed. These subband envelopes are derived from autoregressive modeling of Hilbert envelopes of the signal in critical bands, processed by both a static (logarithmic) and a dynamic (adaptive loops) compression. These features are then used for...

Publication date: 2012